Morphological constrained feature enhancement with adaptive cepstral compensation (MCE-ACC) for speech recognition in noise and Lombard effect
Author
Abstract
The use of present-day speech recognition techniques in many practical applications has demonstrated the need for improved algorithm formulation under varying acoustical environments. This paper describes a low-vocabulary speech recognition algorithm that provides robust performance in noisy environments with particular emphasis on characteristics due to the Lombard effect. A neutral and stressed-based source generator framework is established to achieve improved speech parameter characterization using a morphological constrained enhancement algorithm and stressed source compensation, which is unique for each source generator across a stressed speaking class. The algorithm uses a noise-adaptive boundary detector to obtain a sequence of source generator classes, which is used to direct noise parameter enhancement and stress compensation. This allows the parameter enhancement and stress compensation schemes to adapt to changing speech generator types. A phonetic consistency rule is also employed based on input source generator partitioning. Algorithm performance evaluation is demonstrated for noise-free and nine noisy Lombard speech conditions that include additive white Gaussian noise, slowly varying computer fan noise, and aircraft cockpit noise. System performance is compared with a traditional discrete-observation recognizer with no embellishments. Recognition rates are shown to increase from an average 36.7% for a baseline recognizer to 74.7% for the new algorithm (a 38% improvement). The new algorithm is also shown to be more consistent, as demonstrated by a decrease in standard deviation of recognition from 21.1 to 11.9 and a reduction in confusable word-pairs under noisy, Lombard-effect stressed speaking conditions.
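The quoted gain can be read either as an absolute difference in percentage points or as a change relative to the baseline. The short sketch below computes both from the figures given in the abstract; the helper names `absolute_gain` and `relative_change` are hypothetical and do not come from the paper. It suggests that the stated 38% is the absolute difference between the two average recognition rates.

```python
# Figures quoted in the abstract; the helper names are illustrative assumptions.
BASELINE_RATE = 36.7  # average recognition rate of the baseline recognizer, in percent
ENHANCED_RATE = 74.7  # average recognition rate of the new algorithm, in percent
BASELINE_STD = 21.1   # standard deviation of recognition, baseline
ENHANCED_STD = 11.9   # standard deviation of recognition, new algorithm


def absolute_gain(before: float, after: float) -> float:
    """Difference between two values, in the same units as the inputs."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the starting value."""
    return 100.0 * (after - before) / before


if __name__ == "__main__":
    print(round(absolute_gain(BASELINE_RATE, ENHANCED_RATE), 1))    # 38.0 points
    print(round(relative_change(BASELINE_RATE, ENHANCED_RATE), 1))  # 103.5 percent
    print(round(absolute_gain(BASELINE_STD, ENHANCED_STD), 1))      # -9.2
    print(round(relative_change(BASELINE_STD, ENHANCED_STD), 1))    # -43.6 percent
```

Under this reading, the average recognition rate roughly doubles relative to the baseline, and the standard deviation falls by a little under half.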
Similar resources
Morphological Constrained Feature Enhancement with Adaptive Cepstral Compensation (MCE-ACC) for Speech Recognition in Noise and Lombard Effect
The use of present-day speech recognition techniques in many practical applications has demonstrated the need for improved algorithm formulation under varying acoustical environments. This paper describes a low-vocabulary speech recognition algorithm which provides robust performance in noisy environments with particular emphasis on characteristics due to the Lombard effect. A neutral and stressed b...
Analysis and compensation of speech under stress and noise for environmental robustness in speech recognition
It is well known that the introduction of acoustic background distortion and the variability resulting from environmentally induced stress cause speech recognition algorithms to fail. In this paper, several causes for recognition performance degradation are explored. It is suggested that recent studies based on a Source Generator Framework can provide a viable foundation in which to establish ...
Lombard effect compensation and noise suppression for noisy Lombard speech recognition
The performance of speech recognition systems degrades rapidly in the presence of ambient noise. To reduce the degradation, a degradation model is proposed which represents the spectral changes of speech signals uttered in noisy environments. The model uses frequency warping and amplitude scaling of each frequency band to simulate the variations of formant location, formant bandwidth, pitch, spec...
Front-End Compensation Methods for LVCSR Under Lombard Effect
This study analyzes the impact of noisy background variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). Robustness of several front-end feature extraction strategies combined with state-of-the-art feature distribution normalizations is tested on neutral and Lombard speech from the UT-Scope database presented in two types of background noise at various le...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
In recent years, automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Journal:
- IEEE Trans. Speech and Audio Processing
Volume 2, Issue
Pages -
Publication date: 1994